Spatial Singing Voice Synthesis

Spatial SVS (Singing Voice Synthesis) aims to produce expressive, high-quality singing voices enriched with accurate spatial cues, thereby enhancing listener immersion.

Please wear headphones to listen.

Demo

Text: 才二十三现在的日子没有那么简单 (English gloss: "Only twenty-three, and life these days is not that simple")

Phoneme Sequence: <AP>, c, ai, ai, ai, er, sh, i, s, an, <AP>, x, ian, z, ai, ai, ai, d, e, r, i, z, i, <AP>, m, m, ei, iou, n, a, m, e, j, j, ian, d, an, <SP> (<SP> denotes a silence segment and <AP> denotes a breath sound)

Spatial Prompt: [STATIC] Source locates at left-front up quadrant, and pauses in left-front up quadrant.

GT

Mono + SP

RMSSinger + SP

ISDrama (sing)

Text: 成都带不走的只有你 (English gloss: "The only thing I cannot take with me from Chengdu is you")

Phoneme Sequence: <SP>, ch, eng, d, u, <SP>, d, ai, b, u, z, ou, d, e, <SP>, zh, i, i, iou, iou, n, i, i, i, <SP> (<SP> denotes a silence segment and <AP> denotes a breath sound)

Spatial Prompt: [STATIC] Source locates at right-front up quadrant, and pauses in right-front up quadrant.

GT

Mono + SP

RMSSinger + SP

ISDrama (sing)

Text: 你可以不用记得我的好 (English gloss: "You do not have to remember how good I was to you")

Phoneme Sequence: <AP>, n, i, k, e, e, i, b, u, u, iong, j, i, i, d, e, uo, uo, d, e, h, ao (<SP> denotes a silence segment and <AP> denotes a breath sound)

Spatial Prompt: [STATIC] Source locates at right up quadrant, and pauses in right up quadrant.

GT

Mono + SP

RMSSinger + SP

ISDrama (sing)

Text: 穿过时间的缝隙它依然真实地 (English gloss: "Passing through the cracks of time, it still truly")

Phoneme Sequence: ch, uan, g, uo, sh, i, j, ian, ian, d, e, f, eng, x, i, <AP>, t, a, a, i, i, r, an, zh, en, sh, i, d, e (<SP> denotes a silence segment and <AP> denotes a breath sound)

Spatial Prompt: [STATIC] Source locates at front up quadrant, and pauses in front up quadrant.

GT

Mono + SP

RMSSinger + SP

ISDrama (sing)
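
For readers who want to process the sample captions programmatically, the phoneme sequences and spatial prompts listed above could be parsed along the following lines. This is only a minimal sketch based on the text format shown on this page; the names `parse_phonemes`, `parse_spatial_prompt`, and `SpatialPrompt` are hypothetical and are not part of any released code, and the prompt pattern is assumed from the four examples here, so other prompt types may not match.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Special tokens as described in the captions: <SP> = silence, <AP> = breath.
SPECIAL_TOKENS = {"<SP>": "silence", "<AP>": "breath"}

# Pattern assumed from the prompts shown on this page (hypothetical, not official).
PROMPT_RE = re.compile(
    r"\[(\w+)\]\s*Source locates at (.+?) quadrant, and pauses in (.+?) quadrant\."
)


@dataclass
class SpatialPrompt:
    motion: str  # tag in square brackets, e.g. "STATIC"
    start: str   # quadrant where the source is located, e.g. "left-front up"
    end: str     # quadrant where the source pauses


def parse_phonemes(sequence: str) -> List[Tuple[str, str]]:
    """Split a comma-separated phoneme string and label each token."""
    tokens = [tok.strip() for tok in sequence.split(",") if tok.strip()]
    return [(tok, SPECIAL_TOKENS.get(tok, "phoneme")) for tok in tokens]


def parse_spatial_prompt(text: str) -> SpatialPrompt:
    """Extract the motion tag and the two quadrant descriptions from a prompt."""
    match = PROMPT_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"Unrecognised spatial prompt: {text!r}")
    return SpatialPrompt(*match.groups())


if __name__ == "__main__":
    phones = parse_phonemes("<SP>, ch, eng, d, u, <SP>")
    print(phones)
    prompt = parse_spatial_prompt(
        "[STATIC] Source locates at right up quadrant, and pauses in right up quadrant."
    )
    print(prompt)
```

Keeping the silence and breath markers as labelled tokens, rather than dropping them, preserves the alignment between the phoneme sequence and the audio.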